Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error, rather than warn, once a number of invalid path operators are encountered in EvaluatorPreprocessor.read (bug 1443140) #9838

Merged
merged 1 commit into from
Jun 28, 2018

Conversation

Snuffleupagus
Copy link
Collaborator

Incomplete path operators, in particular, can result in fairly chaotic rendering artifacts, as can be observed on page four of the referenced PDF file.

The initial (naive) solution that was attempted, was to simply throw a FormatError as soon as any invalid (i.e. too short) operator was found and rely on the existing ignoreErrors code-paths. However, doing so would have caused regressions in some files; see the existing issue2391-1 test-case, which was promoted to an eq test to help prevent future bugs.
Hence this patch, which adds special handling for invalid path operators since those may cause quite bad rendering artifacts.

You could, in all fairness, argue that the patch is a handwavy solution and I wouldn't object. However, given that this only concerns corrupt PDF files, the way that PDF viewers (PDF.js included) try to gracefully deal with those could probably be described as a best-effort solution anyway.

This patch also adjusts the existing warn/info messages to print the command name according to the PDF specification, rather than an internal PDF.js enumeration value. The former should be much more useful for debugging purposes.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1443140.

@Snuffleupagus Snuffleupagus force-pushed the invalid-path-OPS branch 2 times, most recently from 97bac55 to 56fcbb7 Compare June 24, 2018 14:02
…encountered in `EvaluatorPreprocessor.read` (bug 1443140)

Incomplete path operators, in particular, can result in fairly chaotic rendering artifacts, as can be observed on page four of the referenced PDF file.

The initial (naive) solution that was attempted, was to simply throw a `FormatError` as soon as any invalid (i.e. too short) operator was found and rely on the existing `ignoreErrors` code-paths. However, doing so would have caused regressions in some files; see the existing `issue2391-1` test-case, which was promoted to an `eq` test to help prevent future bugs.
Hence this patch, which adds special handling for invalid path operators since those may cause quite bad rendering artifacts.

You could, in all fairness, argue that the patch is a handwavy solution and I wouldn't object. However, given that this only concerns *corrupt* PDF files, the way that PDF viewers (PDF.js included) try to gracefully deal with those could probably be described as a best-effort solution anyway.

This patch also adjusts the existing `warn`/`info` messages to print the command name according to the PDF specification, rather than an internal PDF.js enumeration value. The former should be much more useful for debugging purposes.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1443140.
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/a0e954a9368ca7c/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.215.176.217:8877/5f5c78b045a5b89/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/5f5c78b045a5b89/output.txt

Total script time: 23.47 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/a0e954a9368ca7c/output.txt

Total script time: 38.61 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij
Copy link
Contributor

I think it's good to have a fix for this, but I can't really comment on the heuristic itself since I'm not really familiar with this part of the codebase. @brendandahl or @yurydelendik Could you perhaps take a look here to see if the heuristic is OK?

@brendandahl brendandahl self-assigned this Jun 28, 2018
@brendandahl
Copy link
Contributor

/botio makeref

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_makeref from @brendandahl received. Current queue size: 0

Live output at: http://54.67.70.0:8877/1d0b8bdef0975f3/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_makeref from @brendandahl received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b9d09e49cddfcd8/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/b9d09e49cddfcd8/output.txt

Total script time: 21.36 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/1d0b8bdef0975f3/output.txt

Total script time: 35.74 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@timvandermeij timvandermeij merged commit 6fa2c77 into mozilla:master Jun 28, 2018
@timvandermeij
Copy link
Contributor

Thank you!

@Snuffleupagus Snuffleupagus deleted the invalid-path-OPS branch June 28, 2018 21:24
movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018
Error, rather than warn, once a number of invalid path operators are encountered in `EvaluatorPreprocessor.read` (bug 1443140)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants